SUMMARY


This report presents a new automatic processing technique for unsupervised
classification (or clustering) of multispectral remote sensing data. The
technique has been implemented in a digital computer program. Applications
of the program to actual multispectral scanner data from aircraft surveys
are also presented.

Until now, the main approaches have been based on supervised maximum
likelihood classification techniques, which require reference spectral target
signatures from training areas on the ground. One of the most serious
drawbacks of the supervised classification techniques is the high variability
of the spectral signatures.

The unsupervised classification technique avoids this drawback by not
requiring reference signatures. Essentially, the technique groups the data
into a number of classes based on the intrinsic similarity within each
class. The physical identification of each class is done by checking a small
area belonging to each class after the data processing. In this respect,
the unsupervised technique is applied in the reverse order of the supervised
technique. One advantage of processing the data in this order is that the
investigator knows better where to collect the ground truth. A second
advantage is on-board data compression, which minimizes the rate of data
transmission from future spacecraft to the ground receiving stations. A
third advantage is that automatic change analysis in earth resources studies
can be carried out more logically by the unsupervised technique.

The new unsupervised technique classifies multispectral remote sensing data
from either a multispectral scanner or digitized color-separation aerial
photographs. It consists of two parts: (a) a sequential statistical
clustering, which is a one-pass sequential variance analysis, and (b) a
generalized K-means clustering. In this composite clustering technique, the
output of (a) is a set of initial clusters, which are input to (b) for
further improvement by an iterative scheme.

The technique has been applied, using an IBM-7094 computer, to multispectral
data sets over Purdue's Flight Line C-1 and the Yellowstone National Park
test site. Comparisons between the classification maps produced by the
unsupervised technique and by the supervised maximum likelihood technique
indicate that the two achieve comparable classification accuracy.

Section I

INTRODUCTION 

Applications of nonsupervised clustering techniques have recently
attracted more attention for processing and analyzing multispectral data
obtained by remote sensing of the earth's resources and environment (refs. 1-7).
In the past, the main approaches to these types of data were based on
supervised classification techniques, which required reference spectral target
signatures from training areas (ref. 8).

One of the most serious drawbacks of the supervised classification
techniques is associated with the high variability of the reference spectral
signatures. These signatures depend not only on the different physical
targets of interest, but also on the following factors (some known and some
unknown in the process of data gathering):

• Background materials 

• Atmospheric and meteorological conditions 

• Different physical location and orientation 

• Time of day, different season of data collection

• Sensor scan angle and sun elevation and azimuth 

• Different stages of plant growth 

• Different land use practices. 

With so many variable factors affecting the remote sensing data, it is 
very difficult, if not impractical, to set up an operational system for estab- 
lishing the reference spectral signature (or ground truth) library. So far, 
the users of the supervised classification methods mainly obtain the reference 
spectral signatures directly from training sets which form parts of the test 
areas. Even with this practice, it still requires much human judgment and 
intervention to select proper training areas for obtaining sufficient accuracy 
of classification. 

The nonsupervised classification, or clustering, techniques avoid most of
the above difficulties and operational impracticability. The clustering
technique does not require the reference spectral signatures. In essence, the
technique groups the sample data into a number of classes, all of which
are statistically homogeneous. Finally, the physical identification of each
class is accomplished by collecting the ground truth from a suitably sized
area belonging to that particular class. In this sense, clustering techniques
are applied to multispectral data analysis in the reverse order of the
supervised classification technique. The advantage of processing the data
in this order is that it is better known where to collect the reference
ground truth.

Another advantage of the clustering technique is data flow compression
in the telemetry of data from the spacecraft to the ground data receiving
station. It is quite clear now that the data rate collected by
satellite-borne sensors will be so large that present telemetry systems cannot
handle it. However, Dr. A. Park indicated that if the data can be compressed
by a factor of 50 or more relative to the present volume, then presently
available commercial TV receiving stations can be used for space data
collection. It is quite feasible that the clustering techniques can process
the raw data on board and compress it to an acceptably reduced volume for
this purpose. It is also possible, in hydrological applications, to augment a
relatively small number of ground sensors with remote sensing data processed
by the clustering techniques.

Under the present contract, a new composite sequential K-means clustering
algorithm has been developed and applied to two sets of remote sensing data:
the Purdue agricultural field (Purdue C-1 Flightline) and the Yellowstone
National Park test site. The latter test site is natural wilderness with
various terrain types and forest cover, and parts of it are under cloud
shadow. According to Dr. A. Park, NASA Headquarters, Earth Resources
Program, this set of remote sensing data is about the most complex data
collected under the NASA Earth Resources Program. Thus, it may offer the most
critical test to date of the capability of the unsupervised clustering
technique. If the technique can obtain an acceptably accurate classification
map, then it may be safe to apply it to other multispectral remote sensing
data for earth resources and environment surveys.

The principles behind the composite clustering technique are presented in
Section II. Detailed mathematical algorithms, computer programs, and the
user's manual are not given in this report, but will be included in the
final contract report. Applications of the technique to the aforementioned
two sets of data, together with some supporting processing by other computer
programs developed under previous contracts (ref. 9), are given in Sections
III and IV, respectively. A comparison of the unsupervised classification
maps with Purdue LARS' results (refs. 8 and 10) is made. Finally, some
future developments and concluding remarks are given in Section V.

Section II

THE COMPOSITE SEQUENTIAL K-MEANS CLUSTERING TECHNIQUE 

The composite clustering technique consists essentially of two independent
clustering techniques. The first is called the statistical sequential
classification technique (SSC) (refs. 11 and 12) and the second is called the
generalized K-means technique (GKM) (ref. 13). Each technique is described
first, followed by a description of how the two are merged into one.

2.1 STATISTICAL SEQUENTIAL CLUSTERING 

The sensor collects multispectral data from a target which forms an image.
An image can be composed of m scan lines of n resolution elements per scan
line. Each resolution element yields a K-dimensional observation vector
$x(\lambda_i)$, i = 1, 2, ..., K, where $\lambda_i$ indicates the i-th
spectral band.

The purpose of the SSC program is to classify the given sequence of
multispectral data into a specified number of subclasses, each of which is
statistically homogeneous, i.e., similar in its spectral characteristics. To
accomplish this goal, the program consists of four main steps:

• Establishing new classes 

• Classifying new samples into established classes 

• Merging excessive classes 

• Displaying classification results and statistics. 

A flowchart of the main steps of the algorithm is depicted in Figure 2-1.
In Step 1, all control parameters and statistical tables are read in. In
Step 2, M (M = 6) samples are read in and tested to decide whether they
come from the same population. If they do, they are designated as the
first population. If they do not, the first sample is placed in a null
class, which holds all unidentifiable samples, and a new sample is read in,
as shown in Steps 6 and 7. The new set of M samples is tested once again in
Step 3 to see whether it constitutes a new population. This process is
repeated until the first population is established. The statistical
parameters of interest for this population are calculated in Step 5.

Next, the program checks whether the end of the entire sample sequence
has been reached. If it has, the program prints out the final results: the
number of samples in each population, the corresponding sample mean vector
and covariance matrix, and the classification map. This map represents the
2-dimensional spatial location of the samples from each population. After
this printout, the program terminates. If there are still samples left, the
program checks whether the total number of established homogeneous
populations exceeds the prescribed number. If the answer is yes, the program
proceeds to Step 11 in order to reduce the number of established populations
back to the prescribed number. This is accomplished by combining the two
populations that are most similar to each other into an enlarged population
encompassing all the samples belonging to the two original populations.
The program then recalculates the corresponding sample mean vector and
covariance matrix in Step 12. If the answer is no, the program proceeds to
Step 13 to read in a new sample. The sample is then tested in Step 14 to see
whether it belongs to any established population. If the answer is yes, the
sample is added to the population to which it belongs, and the corresponding
sample mean vector and covariance matrix are updated. This process is
repeated until a new sample is encountered which does not belong to any of
the established populations. This new sample is held in a temporary hold
location until M such samples have been accumulated. These M samples are
then tested to see whether they constitute a new population, as was done for
the establishment of the first population. If the test is affirmative, a new
population is set up for them and the program continues to Step 5. If the
test is negative, the sample which has been held longest in the temporary
hold is placed in the class of unidentifiable samples, and the program
proceeds to read in a new sample. This process is repeated until the whole
sample sequence has been processed. The final output of the algorithm is a
printout of the number of samples, the mean vector, and the covariance matrix
for each population, a divergence matrix among all the populations, and
finally a 2-dimensional map of the spatial locations of the samples for each
population.
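
The control flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the report's program: the statistical homogeneity tests are not given in this report, so a hypothetical distance threshold `radius` stands in for them; only mean vectors are tracked (no covariance or divergence matrices); and the merging of excess populations (Steps 11 and 12) is omitted. The names `sequential_cluster`, `m`, and `radius` are invented for the example.

```python
from math import dist

def sequential_cluster(samples, m=6, radius=1.0):
    """One-pass sequential clustering sketch.

    `radius` is a hypothetical stand-in for the report's statistical tests.
    Returns (populations, labels); label -1 marks the null class.
    """
    pops = []                     # one dict per population: mean vector and count
    labels = [-1] * len(samples)  # -1 = null (unidentifiable) class
    hold = []                     # indices held until m unassigned samples accumulate

    for i, x in enumerate(samples):
        # Test the new sample against the established populations first.
        best = min(range(len(pops)),
                   key=lambda p: dist(x, pops[p]["mean"]), default=None)
        if best is not None and dist(x, pops[best]["mean"]) <= radius:
            pop = pops[best]
            pop["n"] += 1
            # Running-mean update, so no second pass over the data is needed.
            pop["mean"] = [mu + (xj - mu) / pop["n"]
                           for mu, xj in zip(pop["mean"], x)]
            labels[i] = best
            continue

        hold.append(i)
        if len(hold) < m:
            continue
        # m held samples: do they form one homogeneous group?
        center = [sum(samples[h][j] for h in hold) / m for j in range(len(x))]
        if all(dist(samples[h], center) <= radius for h in hold):
            pops.append({"mean": center, "n": m})
            for h in hold:
                labels[h] = len(pops) - 1
            hold = []
        else:
            hold.pop(0)           # oldest held sample goes to the null class
    return pops, labels
```

Samples still in the hold list when the sequence ends keep the label -1, which mirrors the drawback that the null class is never re-examined in a single pass.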


2.2 GENERALIZED K-MEANS CLUSTERING 

The generalized K-means algorithm essentially consists of three steps 
plus an additional step for displaying the 2-dimensional map of clustering 
results. This algorithm is an improved version of the existing K-means 
algorithms (refs. 14 through 17). 

2.2.1 Step 1 - Estimation of Initial Cluster Centers 

Let the sample sequence be denoted by $\{x_i(\lambda_j)$, i = 1, 2, ..., M and
j = 1, 2, 3, ..., N$\}$. Here i denotes the sample number and j denotes its
components. The first initial cluster center will be the first sample, i.e.,

$$C_1(\lambda_j) = x_1(\lambda_j) \tag{2-1}$$

The second initial cluster center $C_2$ will be the sample which has the
farthest distance from $C_1$, i.e.,

$$C_2(\lambda_j) = x_i(\lambda_j) \quad \text{with the maximum of} \quad \sum_{j=1}^{N} [x_i(\lambda_j) - x_1(\lambda_j)]^2 \quad \text{over all } i. \tag{2-2}$$


The (k+1)th initial cluster center (for k ≥ 2) will be the sample which
has the maximum of the minimum distances with respect to the k established
initial cluster centers, i.e.,

$$C_{k+1}(\lambda_j) = x_i(\lambda_j) \quad \text{with} \quad \max_i \, \min_{1 \le l \le k} \left[ \sum_{j=1}^{N} [x_i(\lambda_j) - C_l(\lambda_j)]^2 \right] \tag{2-3}$$

where l runs over the cluster centers already established.

The result of the above procedure is to plant the prescribed number of
initial cluster centers evenly over that part of the measurement space
occupied densely by the given input sample sequence. The step of estimating
initial centers is relatively time consuming. The computer time requirement
will be proportional to

$$\tfrac{1}{2} M N K^2 \left(1 - \tfrac{1}{K} + \ldots\right) \tag{2-4}$$

where K is the total number of cluster centers. Clearly, the computer time
required is linearly proportional to the total number of samples M and to the
number of components per sample N, but grows almost as the square of the
required number of cluster centers, K. Usually, N is fixed, but in general
one would expect, without any prior knowledge, that K would increase with M.
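
As an illustration of Step 1, the following Python sketch picks the first sample as the first center and then repeatedly picks the sample whose minimum squared distance to the centers chosen so far is largest. The function names are invented for the example. One design choice differs from the cost estimate above and is an assumption of this sketch: each sample's distance to its nearest chosen center is cached, so every new center costs one pass of M·N operations instead of a comparison against all established centers.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def maximin_centers(samples, k):
    """Choose k initial cluster centers by the max-min distance rule."""
    centers = [samples[0]]                     # first center = first sample
    # nearest[i] = squared distance from sample i to its closest chosen center
    nearest = [sq_dist(x, centers[0]) for x in samples]
    while len(centers) < k:
        # Sample with the largest minimum distance becomes the next center.
        i = max(range(len(samples)), key=nearest.__getitem__)
        centers.append(samples[i])
        nearest = [min(d, sq_dist(x, samples[i]))
                   for d, x in zip(nearest, samples)]
    return centers

if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (10, 10), (0, 9)]
    print(maximin_centers(pts, 3))
```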


2.2.2 Step 2 - Preliminary Improvement of Cluster Centers 

This step of improving the accuracy of the initial cluster centers is exactly
the same as that used in the present K-means algorithm. "Preliminary" is used
here because another improvement to the cluster centers will be made after
this step, as discussed in Step 3. The minimum distance criterion is employed.
The entire sample sequence is classified into K groups by calculating the
distance of each sample with respect to each cluster center and classifying
the sample into the particular center that yields the minimum distance, i.e.,

$$x_i(\lambda_j) \rightarrow C_k(\lambda_j) \quad \text{if} \quad \sum_{j=1}^{N} [x_i(\lambda_j) - C_k(\lambda_j)]^2 \quad \text{is the minimum over all } k. \tag{2-5}$$


This classification is equivalent to setting up a system of hyperplane
decision boundaries to separate the K clusters. Once this is done, the samples
belonging to each cluster center are used to calculate its mean measurement
vector (or center-of-gravity). These updated K centers are then regarded as
the initial cluster centers for the next iteration. The procedure is repeated
until the difference (or distance) between two successive iterated values of
every cluster center is smaller than some prescribed threshold value. In
general, one would expect a smaller threshold value to give better (or more
accurate) results, but it also requires a larger number of iterations. Some
compromise is thus called for, and no general rule can be specified.
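
The iteration of Step 2 can be sketched as follows. This is a minimal illustration, assuming squared Euclidean distance and a hypothetical threshold `tol` on the largest squared shift of any center; the names `kmeans`, `tol`, and `max_iter` are not from the report. An empty cluster simply keeps its previous center, which is also an assumption of the sketch.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(samples, centers, tol=1e-6, max_iter=100):
    """Minimum-distance classification followed by center-of-gravity updates."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        # Classify every sample into the nearest cluster center.
        groups = [[] for _ in centers]
        for x in samples:
            k = min(range(len(centers)), key=lambda c: sq_dist(x, centers[c]))
            groups[k].append(x)
        # Replace each center by the mean of its samples.
        new = [[sum(col) / len(g) for col in zip(*g)] if g else c
               for g, c in zip(groups, centers)]
        shift = max(sq_dist(a, b) for a, b in zip(new, centers))
        centers = new
        if shift < tol:           # every center moved less than the threshold
            break
    return centers

if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(kmeans(data, [(0, 0), (10, 0)]))
```

A smaller `tol` lets the loop run for more passes over the data, which is the trade-off between accuracy and number of iterations noted above.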

2.2.3 Step 3 - Final Improvement of Cluster Centers

The reason for requiring some further improvement to the cluster centers
obtained from Step 2 can be illustrated by Figure 2-2. In this figure there
are three natural clusters in a 2-component scatter diagram. Further, these
three clusters are clearly linearly separable, so it is desirable to separate
the samples into three clusters. Using Step 2, the best result obtainable
after a sufficient number of iterations is the linear minimum-distance
decision boundary indicated by the solid lines. Some of the samples actually
belonging to cluster No. 1 are misclassified into clusters No. 2 and 3. This
results from the fact that the inter-distance between clusters No. 1 and 2
(and similarly between clusters No. 1 and 3) is about equal to the sum of the
two intra-distances of the individual clusters, of which one is much larger
than the other.

This hypothetical example is actually a very common case in the multi- 
spectral observations of earth resources and environments. Investigators of 
spectral signatures have pointed out this difficulty many times. 

One way to deal with this difficulty, and thus improve the power of the
present K-means algorithm, is proposed here. To a first approximation, the
intra-distance of samples within one cluster is given by the sample standard
deviation vector, that is, the square roots of the diagonal elements of the
sample covariance matrix. Except for very elongated clusters, this sample
standard deviation vector may be characterized by a single scalar, i.e., the
root mean square of the standard deviations of the components. This
characterization is completely correct if each component has the same
standard deviation. With this basic understanding, the minimum-distance
criterion used in the present K-means algorithm can be replaced by a more
general similarity criterion with the standard deviations as weights to
better locate the decision hyperplanes.

Figure 2-2. COMPARISON OF THE PRESENT AND GENERALIZED K-MEANS ALGORITHMS

The proposed similarity criterion can be expressed as


$$x_i(\lambda_j) \rightarrow C_k(\lambda_j) \quad \text{if} \quad \frac{1}{S_k^2} \sum_{j=1}^{N} [x_i(\lambda_j) - C_k(\lambda_j)]^2 \quad \text{is the minimum over all } k. \tag{2-7}$$

Here, $S_k$ is the characterized sample standard deviation for the k-th
cluster center. The rest of the step will be the same as in Step 2.
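
To make the weighted criterion concrete, the sketch below computes a scalar spread for a cluster as the root mean square of its per-component standard deviations, and then assigns a sample to the center with the smallest squared distance divided by the squared spread. The function names are invented for the example, and dividing by n rather than n - 1 when estimating the variances is an assumption; the report does not say which estimator was used. A spread of zero (a cluster with identical samples) is not guarded against here.

```python
from math import sqrt

def rms_spread(group, center):
    """Scalar spread: RMS of the per-component standard deviations."""
    n, dim = len(group), len(center)
    var = [sum((x[j] - center[j]) ** 2 for x in group) / n for j in range(dim)]
    return sqrt(sum(var) / dim)

def assign_weighted(x, centers, spreads):
    """Index of the center minimizing squared distance / squared spread."""
    def score(k):
        d2 = sum((xj - cj) ** 2 for xj, cj in zip(x, centers[k]))
        return d2 / spreads[k] ** 2
    return min(range(len(centers)), key=score)

if __name__ == "__main__":
    centers = [(0.0,), (10.0,)]
    # Wide cluster at 0, tight cluster at 10; the sample at 6 is nearer to 10.
    print(assign_weighted((6.0,), centers, [1.0, 1.0]))   # equal weights
    print(assign_weighted((6.0,), centers, [4.0, 0.5]))   # spread-weighted
```

With equal spreads the rule reduces to the minimum-distance criterion of Step 2; with unequal spreads the sample at 6 is assigned to the wide cluster, which is the behavior Figure 2-2 is meant to motivate.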


Three important points germane to the added step will now be discussed.
First, one might ask why Step 3, with its more general similarity measure,
is not used exclusively, i.e., eliminating Step 2 altogether. The answer is
that the sample standard deviations for the K cluster centers may not be
accurate enough during the first few iterations of improving the cluster
centers, and that their estimates are more susceptible to the influence of
misclassified samples than are the centers-of-gravity of the clusters. Hence,
there is no clear reason to expect better performance from Step 3 than from
Step 2 during the first several iterations. If, on the other hand, Step 2 is
employed to its utmost capacity, then the best possible estimate of the
sample standard deviations is obtained, and the true power of the more
general similarity criterion will prevail.


The second point is concerned with whether Step 3, with its additional
evaluation of sample standard deviations, will be very time consuming. The
answer is no, since in Step 3, as in Step 2, the square of the distance of
each sample with respect to each cluster center must be calculated in order
to classify the sample into the cluster center with the shortest distance.
The evaluation of the sample variance for each cluster center can make use of
this calculation by adding a simple updating routine for accumulation.
Therefore, each iteration of Step 3 will take only slightly more time than
one of Step 2.

The last point is that Step 3 will not degrade the results from Step 2.
As has been demonstrated, misclassification may occur in Step 2 only if the
intra-distance of the samples in a cluster is greater than half of the
inter-distance between two clusters, and the proposed Step 3 can remedy this
difficulty. Step 3 will do just as well as Step 2 for the cases that Step 2
handles perfectly, i.e., the cases in which the inter-distance between two
clusters is much larger than the sum of the intra-distances of the individual
clusters. This can easily be shown by noting that, for a perfect (i.e.,
completely correct) classification by the minimum-distance criterion, the
intra-distance of each cluster need only be shorter than one-half of the
inter-distance between the associated pair of cluster centers. Consider the
most trying case that is still completely separable by the minimum-distance
criterion, namely, the larger of the two intra-distances is equal to one-half
of the inter-distance between the clusters and the shorter one is much
smaller. For such a case, the generalized similarity criterion sets the
hyperplane decision boundary at a distance of twice the shorter
intra-distance from the center of the smaller cluster. So, a perfect
classification also results.
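
The "twice the shorter intra-distance" figure can be checked with a short calculation along the line joining the two cluster centers. The symbols D (inter-distance), s_1 and s_2 (larger and shorter intra-distances), and d_1 and d_2 (distances from the boundary to the two centers) are introduced here for illustration and are not the report's notation. The calculation assumes the weighted criterion compares squared distance divided by squared spread, so the boundary lies where the two weighted distances are equal.

```latex
\frac{d_1^2}{s_1^2} = \frac{d_2^2}{s_2^2}, \qquad d_1 + d_2 = D
\;\;\Longrightarrow\;\;
d_2 = \frac{D\, s_2}{s_1 + s_2}.

\text{With } s_1 = \tfrac{1}{2} D: \qquad
d_2 = \frac{2\, s_1 s_2}{s_1 + s_2} \;\approx\; 2\, s_2
\quad \text{for } s_2 \ll s_1 .
```

Since d_2 is roughly 2 s_2, the boundary lies outside the smaller cluster, and d_1 = D - d_2 remains larger than s_1 whenever s_2 < s_1, so both clusters stay on their own side of the boundary along this line.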

It is worthwhile to note that the cluster centers established by Steps 1
through 3 can be joined or merged together in order to reduce the total number
of cluster centers. However, with regard to saving computation time, it is
better to start off with fewer clusters than to merge the established
clusters afterwards.

The results of clustering by Step 3 can be displayed in a 2-dimensional
map for multispectral observations such as those from a multispectral line
scanner. In addition, all the statistical parameters and sample probability
density functions can be calculated at the last iteration of Step 3 and
printed out together with the 2-dimensional map.

2.3 MERGING OF SEQUENTIAL AND K-MEANS CLUSTERING 

Before describing how the statistical sequential clustering technique and
the generalized K-means clustering technique can be combined into a more
powerful clustering technique, the merits and drawbacks of each technique are
discussed. This review then points out a natural way of combining the two
techniques.

The single most significant advantage of the SSC is that it requires
only one pass of the entire data sequence to achieve fairly good clustering of
the given data. This truly sequential feature, to the author's knowledge, has
never been accomplished in any existing clustering technique. This feature
also permits fairly fast computation. However, because there is only one pass
of the data sequence, the null class of unidentifiable data samples that
results from establishing new classes cannot be reexamined, which is the main
drawback of the SSC technique.


The most significant advantage of the GKM technique is that it possesses
the capability for repetitive correction and updating of the established
cluster centers. Its main drawback is that the procedure for choosing the
initial cluster centers is quite arbitrary and requires as many passes of the
entire data sequence as the number of cluster centers. Furthermore, because
of these rather inaccurate initial cluster centers, many further iterations
over the entire data sequence are required to achieve good clustering
accuracy.

From the above comparison of these two techniques it is clear that they 
can complement each other, since the drawbacks of each technique can be elim- 
inated by properly merging the two techniques. The composite clustering 
technique is then composed of two main steps: 

(1) The given data sequence is processed by the SSC technique with
    only a single pass of the entire data sequence. The outputs of the
    processing are the mean spectral vectors of the clusters.

(2) The mean spectral vectors from (1) are used as the initial cluster
    centers for the GKM technique. In order to allow for extra cluster
    centers from the null class of the SSC in (1), the original GKM
    procedure for establishing extra initial cluster centers can be
    used as many times as desired. Next, the initial cluster centers
    are iterated about 2 to 3 times to obtain the final clustering.


In short, the above composite clustering technique can accomplish good
unsupervised classification of a given data sequence with about four passes of
the entire data set, regardless of the preset number of clusters.

Section III

UNSUPERVISED CLASSIFICATION OF AGRICULTURAL REMOTE SENSING DATA 

In order to test and demonstrate the capability of the non-supervised
clustering technique, a set of computer programs has been developed. The
programs were employed to process and analyze a well-known set of
multispectral data made available by Purdue University's Laboratory for
Applications of Remote Sensing (LARS).

3.1 DATA DESCRIPTION 

The data were obtained by the University of Michigan multispectral scanner 
over an agricultural experiment test site near Lafayette, Indiana, from a 
flight altitude of 2600 feet on June 28, 1966. This set of data was designated 
as Purdue Flight Line C-1. In particular, only the results from scans 587
through 797 are presented for the purpose of comparing our nonsupervised 
classification results with LARS's supervised classification results of the 
same area (ref. 18). 

3.2 PRELIMINARY DATA ANALYSIS 

Figure 3-1* shows the aerial photo of the target area (about 1 square
mile) with the ground truth designation superimposed. The multispectral
scanner simultaneously recorded 12 channels of spectral bands of radiation
reflected from the earth's surface between 0.4 and 1.0 µm. These spectral
bands are listed in Table 3-1. Again for the purpose of comparison with the
LARS results, only 4 channels were used, i.e., channels 1, 6, 10, and 12.
These 4 channels have been determined by LARS to be the optimal 4-channel
feature selection (based on the divergence measurement) for the Flight Line
C-1 data (ref. 18).

Figures 3-2 through 3-5 show the probability histograms of the individual
channels. These histograms clearly show that the majority of resolution
elements (or targets) have spectral radiance values between 140 and 200, with
the total radiance range being 0 to 256. Further, several distinct peaks
were observed in each histogram, which indicates the mixing of several
different populations, as expected. However, these peaks are not completely
separate. This fact implies that more than one channel out of these four
would be required for discrimination between different populations.

* Figures 3-1 through 3-35 are presented following the text at the end of this
  section.

Table 3-1. SPECTRAL BANDS OF MICHIGAN MULTISPECTRAL SCANNER

CHANNEL NO.   SPECTRAL BANDWIDTH   CHARACTERISTIC   SPECTRAL REGION
              (microns)            COLOR
     1        0.40 - 0.44          Violet           Visible
     2        0.44 - 0.46          Blue             Visible
     3        0.46 - 0.48                           Visible
     4        0.48 - 0.50          Blue-Green       Visible
     5        0.50 - 0.52                           Visible
     6        0.52 - 0.55          Green            Visible
     7        0.55 - 0.58                           Visible
     8        0.58 - 0.62          Yellow           Visible
     9        0.62 - 0.66          Red              Visible
    10        0.66 - 0.72          Red              Visible
    11        0.72 - 0.80                           Reflective near infrared
    12        0.80 - 1.00                           Reflective near infrared

Figures 3-6 through 3-9 show the corresponding grey-level plots of the
channels used. The road running in the flight direction in the middle of the
area is indicated by a blank. Several other rectangular agricultural fields
can also be observed in these plots. In particular, one can see the close
correspondence of the two wheat fields in Figures 3-1 and 3-8. It should be
noted that the complement of the numerical value with respect to 256 is
proportional to the spectral radiance received by the scanner. Hence, the
larger the numeric value used in the grey-level plot, the smaller the
spectral radiance.

Figures 3-10 through 3-12 show three scatter plots between channels
1, 6, and 10. The numbers 1 through 8 indicate the number of samples in
each spectral cell, while the number 9 indicates that the number of samples
is 9 or greater. Two things can be observed from these plots. First, the
scatter patterns of Figure 3-10 (Channel 1 versus Channel 6) and Figure 3-11
(Channel 1 versus Channel 10) are alike. This indicates that Channel 6 and
Channel 10 are possibly linearly related. This prediction is further
confirmed by the scatter pattern in Figure 3-12 (Channel 6 versus Channel
10). Second, the scatter pattern in each figure does not indicate clear-cut
clusters, which implies that completely correct discrimination is impossible
on the basis of any pair of channels out of channels 1, 6, and 10. That is,
three or more channels of data are needed simultaneously for discrimination
between the different crops in this set of data.
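
The digit convention used in these scatter plots (1 through 8 for the sample count in a cell, 9 for nine or more) can be reproduced with a short Python sketch. The function name, the cell width `cell`, and the grid size `size` are hypothetical choices for the example, and all values are assumed to be non-negative integers below `size`.

```python
from collections import Counter

def scatter_digits(pairs, cell=16, size=256):
    """Render a two-channel scatter plot as rows of count digits.

    Each character is one spectral cell: blank for no samples, the count
    for 1 to 8 samples, and 9 for nine or more.
    """
    counts = Counter((a // cell, b // cell) for a, b in pairs)
    n = size // cell
    rows = []
    for r in range(n - 1, -1, -1):        # highest second-channel values on top
        rows.append("".join(
            str(min(counts[(c, r)], 9)) if counts[(c, r)] else " "
            for c in range(n)))
    return rows

if __name__ == "__main__":
    demo = [(0, 0)] * 12 + [(20, 40)] * 3
    for line in scatter_digits(demo, cell=16, size=64):
        print("|" + line + "|")
```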

Figure 3-13 shows an inventory boundary map produced by the boundary
enhancement technique (ref. 7) for the target area. One can see the very
clear correspondence of the boundaries of the different crop fields between
this map and the aerial photo (Figure 3-1). One purpose of generating the
boundary map is to establish the spatial registration between the
multispectral data and the ground scene based on the aerial photo, so that
the training set can be selected, if needed, as in the supervised
classification by LARS. This concludes the preliminary data analysis of this
particular target area. In the following, the results from the non-supervised
classification techniques are discussed.

3.3 UNSUPERVISED CLASSIFICATIONS 

In order to see more clearly the advantage of the composite clustering
technique, the results obtained by employing the SSC technique and the GKM
technique separately are presented first.

Figures 3-14 through 3-19 show the classification maps by the statistical
sequential clustering technique for 18, 17, 16, 15, 14, and 13 classes,
respectively. Actually, only Figure 3-14 with 18 classes was processed from
the data directly. The other classification maps were obtained consecutively
by merging the two most similar classes based on the minimum distance
criterion. It is interesting to examine the merging process in this series of
classification maps. The 18 classes in Figure 3-14 are designated by the
alphanumeric symbols 1, 2, ..., 8, 9, A, B, C, E, F, G, H, and I, respectively.
Class (I) appears in the last scan, 797, from sample numbers 115 through 221
in Figure 3-14. This class was merged into class (1), as shown in Figure 3-15.
Next, class (H), scattered in the rectangle defined by scans 645 and 699 and
sample numbers 1 and 45 in Figure 3-15, was merged into class (4), as shown
in Figure 3-16. Next, class (G), which occupies the rectangle defined by scans
707 and 797 and sample numbers 1 and 19 in Figure 3-16, was merged into class
(C), as shown in Figure 3-17. Next, class (F), which occupies the right side
of scans 791 and 793 in Figure 3-17, was merged into class (1), as shown in
Figure 3-18. Up to this stage, four classes (I, H, G, and F) have been merged
into other classes. The number of samples in each of these four classes is
relatively small compared with the total number of samples in the target
area. However, in the next merging, the very large class (2) in Figure 3-18
was merged into another large class (1). Comparing the classification maps of
Figures 3-18 and 3-19, with 14 and 13 classes respectively, against the aerial
photo clearly shows that classes (1) and (2) should be two separate classes.
Thus, one may conclude that 14 classes may be the most natural classification
of this set of data. Among these 14 classes, the smallest class, containing
only 19 samples, is designated by symbol E in Figure 3-18 or symbol 2 in
Figure 3-19.

NORTHROP SERVICES, INC.

Next, Figures 3-20 through 3-27 show the classification maps by the
generalized K-means clustering technique alone on the same set of data.
Figures 3-20 through 3-22 give the classification maps with 18 classes for
three stages of clustering, i.e., no iteration and after one and two
iterations, respectively. The rest of the classification maps were generated
consecutively by merging the two most similar classes, based on the minimum
distance criterion, down to 13 classes.

Comparing the corresponding classification maps from the statistical
sequential technique and the generalized K-means technique against the ground
truth map (Figure 3-1), the two techniques appear to perform about the same,
with about 70 to 80 percent of the samples correctly classified (clustered).
It should be noted that, for the same accuracy of clustering, the statistical
sequential technique took only one pass of the data set, while the generalized
K-means technique took 20 passes (18 passes for establishing the initial
cluster centers and 2 passes for updating these cluster centers).


The results of the composite sequential K-means clustering technique are
discussed next. Figures 3-28 through 3-33 show the classification maps of the
same target area (Figure 3-1) by the composite technique. Figures 3-28, 3-29,
and 3-30 give the classification maps with 13 classes. These maps were
generated by using the mean vectors in the four channels 1, 6, 10, and 12 of
the 13 most populous classes obtained by the statistical sequential technique
(Figure 3-18) as the initial cluster centers for the generalized K-means
technique. Figure 3-28 gives the classification without any updating of the
cluster centers, while Figures 3-29 and 3-30 give the classification after one
and two iterations, respectively. One can see clearly that, even without any
updating of the cluster centers, the simple reclassification by the K-means
technique has produced a great improvement in accuracy. After only two
iterations (updatings) of the cluster centers, the classification map
corresponds very well with the ground truth map (Figure 3-1). From Figure 3-30,
one can see that the wheat fields are classified into three classes (6, 8, and
C); corn fields into three classes (1, B, and D); oats into two classes (9 and
7); soybeans into two classes (A and 4); while hay, alfalfa, red clover, and
pasture are collectively classified into two classes (2 and 7). The fact that
each of the four crops (wheat, corn, oats, and soybeans) is grouped into more
than one class simply implies that there are variations within each species of
crop. The important point is that the three classes (6, 8, and C) representing
wheat, for example, do not mingle with the other crops. Hence, the clustering
results for these four crops should be considered correct. On the other hand,
the lumping of all the other crops (hay, alfalfa, red clover, etc.) into only
two classes (2 and 7) is due to their very close resemblance in spectral
signature in the four channels used. The difficulty of distinguishing these
crops has also shown up in the supervised classification results by the LARS
program, which will be discussed further when these results are compared with
the LARS results.

The road running vertically through the center of the target area will now be
discussed. In Figure 3-30, the road is designated by class symbols 8, B, A, D,
4, and C. Certainly, this designation is not correct. This misclassification
of the road, however, can be easily corrected by adding one more class, i.e.,
going from 13 to 14 classes, using the K-means technique prior to updating the
initial cluster centers. The results of this processing are shown in Figures
3-31 through 3-33. Figure 3-31 is obtained without updating, while Figures
3-32 and 3-33 are obtained after one and two iterations, respectively. This
14th class (E) unmistakably indicates the road, as one can see in the middle
part (vertically) of Figure 3-33.

Table 3-2 summarizes the quantitative results from the last classification
map (Figure 3-33). It may be noted that there are two small classes, i.e.,
classes (3) and (5), with 5 and 28 samples, respectively. Both of them belong
to the wheat field at the left of the map in Figure 3-33. They show much
stronger spectral radiances in channels 6 and 10 than the other classes. The
cause of this is not clear, because insufficient ground truth information is
available. Note that only four passes of the data set were required for the
classification map by the composite clustering technique.
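
The composite step described in this section, in which class means from the
one-pass stage seed the K-means stage and the centers are then updated a small
number of times, can be sketched as follows. This is only an illustration
under stated assumptions, not the report's computer program: the function
names `assign` and `refine`, the use of plain Euclidean distance, the handling
of empty classes, and the toy two-channel data are all assumptions made for
the example.

```python
import math


def assign(samples, centers):
    """Label each sample with the index of its nearest center (Euclidean)."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(s, centers[k]))
        for s in samples
    ]


def refine(samples, centers, iterations):
    """Seeded K-means: classify, then move each center to its class mean.

    iterations=0 corresponds to classification without any updating.
    A class that receives no samples keeps its previous center
    (an assumption of this sketch).
    """
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        labels = assign(samples, centers)
        for k in range(len(centers)):
            members = [s for s, lab in zip(samples, labels) if lab == k]
            if members:
                centers[k] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assign(samples, centers), centers


# Toy two-channel samples forming two obvious groups (hypothetical values).
samples = [(10, 10), (12, 11), (11, 9), (50, 52), (49, 50), (51, 51)]
# Stand-in for the class means delivered by the one-pass sequential stage.
seeds = [(0, 0), (40, 40)]

labels, centers = refine(samples, seeds, iterations=2)
```

In this sketch each call to `assign` is one pass over the data, so seeding
with reasonable centers is what keeps the total number of passes small.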

3.4 COMPARISON WITH SUPERVISED CLASSIFICATION 

As mentioned earlier, this particular data set was chosen for testing the
composite clustering technique so that the results could be compared with the
supervised classification results obtained by LARS using the maximum likelihood
classification technique. LARS's classification map, which employs the same
four channels as were used for the composite clustering discussed above, is
reproduced in Figure 3-34 (ref. 10). The training fields used in the
classification program are outlined with asterisks (*) and the test fields are
outlined with plus (+) signs. The tabulation of classification results for the
test fields is reproduced in Figure 3-35. The test fields chosen in the LARS
classification covered only about 5989/11660 = 51.5 percent of the entire
field. The overall performance of correct classification is 87.5 percent.
Actually, the entire field has been classified by the LARS program, as is
evidenced by the classification symbols covering the entire field. The
so-called test fields on the map are just the "selected" areas for computing
the accuracy of correct classification. One can see clearly that the overall
performance would be lower than the cited 87.5 percent, about 80 percent or
less, if it were based on the entire field. It is also noted from the LARS
classification results, as well as from the map, that red clover, hay, and
alfalfa are fairly similar to each other, with very little discrimination
among them. Comparing the classification map by the composite clustering
technique (Figure 3-33) with the LARS results and with the ground truth aerial
photo (Figure 3-1), the overall performance of the composite clustering
technique over the entire field is close to 80 percent. That is, the overall
performances of the LARS supervised classification technique and of the
unsupervised composite technique are comparable. However, it is important to
recall that no training fields or any other ground truth information has been
employed in applying the unsupervised technique.

Table 3-2. SUMMARY OF MEAN SPECTRAL RADIANCES OF 14 CLASSES
BY THE COMPOSITE CLUSTERING TECHNIQUE (FIGURE 3-33)

CLASS   CLASS   NO. OF   CHANNEL 1     CHANNEL 6      CHANNEL 10     CHANNEL 12
NUMBER  SYMBOL  SAMPLES  0.4-0.44 µm   0.52-0.55 µm   0.66-0.72 µm   0.8-1.0 µm

   1      1      1575    [illegible]   [illegible]    183.8          [illegible]
   2      2      3103    [illegible]   [illegible]    175.0          [illegible]
   3      3         5    [illegible]   [illegible]     88.8          [illegible]
   4      4      1798    [illegible]   [illegible]    165.9          [illegible]
   5      5        28    166.8         [illegible]    106.5          163.7
   6      6       523    178.2         166.2          152.3          182.3
   7      7       768    181.9         176.3          182.2          163.8
   8      8       318    174.2         152.7          138.3          176.2
   9      9       255    180.4         172.6          165.0          172.2
  10      A      2134    159.3         154.8          159.4          181.1
  11      B       852    169.0         167.3          172.4          184.9
  12      C       165    168.6         140.9          127.8          170.6
  13      D      1736    172.1         165.8          169.4          169.8
  14      E        49    140.7         142.5          149.4          178.7

Total No. of Samples = 11,766

Figure 3-2. PROBABILITY HISTOGRAM OF CHANNEL 1 (0.4-0.44 µm)
Figure 3-3. PROBABILITY HISTOGRAM OF CHANNEL 6 (0.52-0.55 µm)
Figure 3-4. PROBABILITY HISTOGRAM OF CHANNEL 10 (0.66-0.72 µm)
Figure 3-5. PROBABILITY HISTOGRAM OF CHANNEL 12 (0.8-1.0 µm)
Figure 3-6. GREY-LEVEL PLOT OF CHANNEL 1 (0.4-0.44 µm)
Figure 3-7. GREY-LEVEL PLOT OF CHANNEL 6 (0.52-0.55 µm)
Figure 3-9. GREY-LEVEL PLOT OF CHANNEL 12 (0.8-1.0 µm)
Figure 3-10. SCATTER PLOT OF CHANNEL 1 VERSUS CHANNEL 6
Figure 3-11. SCATTER PLOT OF CHANNEL 1 VERSUS CHANNEL 10
Figure 3-12. SCATTER PLOT OF CHANNEL 6 VERSUS CHANNEL 10
Figure 3-13. INVENTORY BOUNDARIES BY THE BOUNDARY ENHANCEMENT TECHNIQUE
FOR PURDUE C-1 FLIGHT LINE
Figure 3-14. CLASSIFICATION MAP BY THE STATISTICAL SEQUENTIAL TECHNIQUE
WITH 18 CLASSES
Figure 3-15. CLASSIFICATION MAP BY THE STATISTICAL SEQUENTIAL TECHNIQUE
WITH 17 CLASSES
Figure 3-16. CLASSIFICATION MAP BY THE STATISTICAL SEQUENTIAL TECHNIQUE
WITH 16 CLASSES
Figure 3-17. CLASSIFICATION MAP BY THE STATISTICAL SEQUENTIAL TECHNIQUE
WITH 15 CLASSES
Figure 3-18. CLASSIFICATION MAP BY THE STATISTICAL SEQUENTIAL TECHNIQUE
WITH 14 CLASSES
Figure 3-19. CLASSIFICATION MAP BY THE STATISTICAL SEQUENTIAL TECHNIQUE
WITH 13 CLASSES
Figure 3-20. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 18 CLASSES AND NO ITERATION
Figure 3-21. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 18 CLASSES AFTER ONE ITERATION
Figure 3-22. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 18 CLASSES AFTER 2 ITERATIONS
Figure 3-23. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 17 CLASSES


Figure 3-24. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 16 CLASSES
Figure 3-25. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 15 CLASSES
Figure 3-26. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 14 CLASSES
Figure 3-27. CLASSIFICATION MAP BY THE GENERALIZED K-MEANS TECHNIQUE
WITH 13 CLASSES
Figure 3-29. CLASSIFICATION MAP BY THE COMPOSITE CLUSTERING TECHNIQUE
WITH 13 CLASSES AFTER ONE ITERATION
Figure 3-30. CLASSIFICATION MAP BY THE COMPOSITE CLUSTERING TECHNIQUE
WITH 13 CLASSES AFTER 2 ITERATIONS
Figure 3-31. CLASSIFICATION MAP BY THE COMPOSITE CLUSTERING TECHNIQUE
WITH 14 CLASSES AND WITHOUT ITERATION
Figure 3-33. CLASSIFICATION MAP BY THE COMPOSITE CLUSTERING TECHNIQUE
WITH 14 CLASSES AFTER TWO ITERATIONS
Figure 3-34. CLASSIFICATION MAP BY PURDUE UNIVERSITY LARS'S SUPERVISED
BAYES CLASSIFICATION TECHNIQUE (Ref. 10, p. 40)

LABORATORY FOR AGRICULTURAL REMOTE SENSING
PURDUE UNIVERSITY
LARSYSAA ILLUSTRATION
CLASSIFICATION STUDY .. SERIAL NO. 705807300
CLASSIFICATION DATE .. JULY 5, 1968

Figure 3-35. TABULATION OF CLASSIFICATION RESULTS OF TEST FIELDS
(Ref. 10, p. 41)

Section IV

UNSUPERVISED CLASSIFICATIONS OF NATURAL TERRAIN TYPES


To give a more critical test and to establish the capability of the
unsupervised clustering technique, the most complex remote sensing data ever
collected by the University of Michigan multispectral scanner under NASA's
sponsorship were chosen: the aircraft survey data over the Yellowstone National
Park test site. These data have been kindly made available by Dr. W. H. Smedes,
U.S. Geological Survey, Denver, Colorado.


4.1 DATA DESCRIPTION 

These particular data were collected by the 12-channel multispectral scanner
onboard an aircraft at an altitude of about 6,000 feet (ref. 7). The scanner
resolution is 3 milliradians. Each scan line contains 220 ground resolution
cells, each about 20 feet square. The multispectral scanner simultaneously
recorded 12 channels of spectral bands reflected from the earth's surface
between 0.4 and 1.0 µm. These spectral bands are listed in Table 3-1. For the
purpose of comparison with Purdue LARS's supervised classification results,
only four channels were used, i.e., channels 2, 9, 10, and 12. These four
channels have been determined by LARS's feature selection program to be the
optimal channels (based on the divergence criterion) for this particular set
of data (ref. 10).


4.2 PRELIMINARY DATA ANALYSIS 

Figure 4-1* shows a gray-scale video display of reflectance for channel 9
(0.62-0.66 µm) over the area. Also shown in the figure is the ground truth
survey, which contains water, bedrock, forest, kame, till, talus, and cloud
shadow over forest. Detailed physical descriptions of these terrain types are
given in reference 10. It is clear that the terrain is very complex and that
many parts of the test site do not have clear-cut boundaries between different
terrain types. This is quite different from the Purdue C-1 Flight Line, in
which the boundaries between different crop types are very clear (see Section
III).

* Figures 4-1 through 4-26 and Tables 4-1 and 4-2 are presented following the
text at the end of this section.

Figures 4-2 through 4-5 show the univariate probability histograms of these
four channels for the data set from scan 200 through scan 500. Very few
distinct modes show in each histogram, which indicates that the spectral
signatures of different terrain types overlap each other and that more than
one channel would be needed to discriminate among different terrain types.

Figures 4-6 and 4-7 show the corresponding digital gray-scale plots of the
test area in channels 2 and 10, respectively. Comparing these gray-scale plots
with the gray-level video display, one can see clear correspondences for
several main areas with large contrast. It should be noted that the complement
of the numerical value with respect to 256 is proportional to the spectral
radiance collected by the scanner. Hence, the larger the numerical value, as
indicated by the interval, the smaller the spectral radiance. Figures 4-8
through 4-13 show the scatter plots between channels 2, 9, 10, and 12. The
numeral 1 indicates that the number of samples in a spectral cell is between 1
and 9, the numeral 2 that it is between 10 and 19, and so forth. From these
scatter plots, one can note that channels 2, 9, and 10 are linearly
correlated, while channel 12 is not correlated with the other three channels.
This implies that the terrain types possess quite different reflectance
characteristics in the visible and reflected IR ranges. It is also noted that
no distinct cluster is visible in these scatter plots, which, in turn,
indicates the overlapping of spectral signatures of different terrain types,
as observed from the univariate probability histograms.
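
The two plotting conventions in the preceding paragraph (the complement with
respect to 256 and the numeral coding of scatter-plot cells) can be written
out as a short sketch. The function names are hypothetical, and returning 0
for an empty cell is an assumption; the text specifies only the coding for
non-empty cells.

```python
def relative_radiance(gray_value):
    """Quantity proportional to spectral radiance.

    The text states that the complement of the recorded value with
    respect to 256 is proportional to radiance, so larger recorded
    values mean smaller radiance.
    """
    return 256 - gray_value


def density_numeral(count):
    """Numeral printed for a scatter-plot cell holding `count` samples.

    1 means 1 to 9 samples, 2 means 10 to 19 samples, and so forth.
    An empty cell returns 0 here (an assumption of this sketch).
    """
    if count <= 0:
        return 0
    return count // 10 + 1


# A recorded value of 100 corresponds to more radiance than a value of 200.
brighter = relative_radiance(100)
darker = relative_radiance(200)
numerals = [density_numeral(n) for n in (1, 9, 10, 19, 20)]
```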

Figure 4-14 shows a boundary map of the test area obtained by using the
boundary enhancement principle (ref. 8). In this map, the symbol (•) indicates
that the enhanced spectral difference among adjacent resolution elements lies
between the mean enhanced value over the entire target area plus one standard
deviation and the mean plus two standard deviations. The symbol (+) indicates
that the enhanced difference lies between the mean plus two standard
deviations and the mean plus three standard deviations. The symbol (x)
indicates that the enhanced difference is greater than the mean plus three
standard deviations. Finally, an area with an enhanced difference smaller than
the mean plus one standard deviation is left blank. In other words, a blank
area implies a relatively homogeneous region, while an area marked with the
symbol (x) has the largest spectral contrast between adjacent resolution
elements. These boundaries are found to be in good correspondence with the
gray-level video display in Figure 4-1.
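
The symbol coding of the boundary map can be sketched as a simple threshold
rule. The function name `boundary_symbol` is hypothetical, and the treatment
of values falling exactly on a threshold is an assumption, since the text does
not say which side they belong to. How the enhanced difference itself is
computed is described in ref. 8 and is not reproduced here.

```python
def boundary_symbol(value, mean, std):
    """Map an enhanced spectral difference to its boundary-map symbol.

    mean and std are the mean and standard deviation of the enhanced
    values over the entire target area. Values exactly on a threshold
    fall into the lower band (an assumption of this sketch).
    """
    if value > mean + 3 * std:
        return "x"   # largest spectral contrast
    if value > mean + 2 * std:
        return "+"
    if value > mean + 1 * std:
        return "•"
    return " "       # relatively homogeneous region, left blank


# Hypothetical enhanced differences with mean 0 and standard deviation 1.
row = "".join(boundary_symbol(v, 0.0, 1.0) for v in (0.5, 1.5, 2.5, 3.5))
```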

4.3 UNSUPERVISED CLASSIFICATION OF TERRAIN TYPES 

Figures 4-15 through 4-24 show the intermediate and final unsupervised
classification maps of the test area by the composite statistical and K-means
technique. The intermediate results are presented to show how the composite
technique performs at its various stages, so that some type of automatic
decision logic may be formulated and built into the present computer program to
achieve a more autonomous unsupervised classification scheme.

For processing this set of data, a maximum of 18 classes was initially
specified for the statistical sequential clustering. The output from this
processing, after only one pass of the entire data set, is a set of mean
spectral signatures for 18 initial classes. (Note: If the K-means clustering
had been used, 17 passes of the entire data set would have been required to
estimate the 18 initial cluster centers. Further, these initial cluster
centers would not be as accurate as those obtained by the statistical
sequential technique.) The choice of a maximum of 18 classes was based on a
rough examination of the video display of the test area (Figure 4-1), from
which 18 classes were believed to be sufficient. Actually, this number is
twice the number of main terrain types indicated by the ground truth survey
map (Figure 4-25) supplied by Dr. W. H. Smedes, U.S. Geological Survey. A
study is presently under way on how to decide on a suitable number of initial
classes for any given data set. This study will be presented in the final
contract report.

The above mean spectral signatures of the 18 initial classes were input to
the K-means clustering program for further improvement. Figures 4-15 and 4-16
show the classification maps after one and two iterations, respectively, of
these initial classes. A comparison of these two maps shows that the majority
of class 7 is grouped into class M in the second iteration. Otherwise, no
noticeable change has occurred in the second iteration. From the ground truth
map, one finds that classes M and 7 both belong to the same terrain type,
forest. The very few changes between the first and second iteration
classification results indicate that the cluster centers have converged very
rapidly to their true locations in the color space. In turn, this may imply
that the initial cluster centers obtained by the statistical sequential
clustering using only one pass of the data are quite good. Hence, with only
three passes of the data set, i.e., one for the statistical sequential
clustering and two for the K-means clustering, a good classification of the
data set has been accomplished. With the K-means clustering alone, more than
20 passes of the data set would have been required, and the clustering results
would not be as accurate as those obtained by the composite technique.

The ground truth map (Figure 4-25) does not give a resolution
element-by-resolution element terrain type specification. Instead, it shows
only the average percentage descriptions of terrain types. For example, one
area at the upper left-hand corner shows 70 percent rubble and 30 percent
forest (i.e., .7 R, .3 F). Thus, it is not possible to make an exact
assessment of the classification accuracy. Furthermore, the ground truth map
gives only nine terrain types. For easier comparison, the 17 classes resulting
from the K-means program were further reduced, one class at a time, to nine
classes. The criterion used for merging classes is the simple Euclidean
distance in the color space. In other words, the pairwise distances among all
17 classes are first calculated, and then the two classes that have the
shortest distance among all possible pairs are combined. This process is
repeated on the resulting 16 classes, and so on. The classification maps for
each of these merging steps are shown in Figures 4-17 through 4-24. The actual
merging processes are summarized in Table 4-1. Two meeting arrows denote the
merging of two classes at that particular stage of merging, with the new
symbol for the merged class given above the arrows. A single arrow denotes a
change of class symbol, for example, from class 2 to class = at the third
stage of merging. The latter change of class symbol is due to the computer
program coding and is of no significance.
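
The merging rule just described (compute all pairwise Euclidean distances
between class means, combine the closest pair, and repeat) can be sketched as
follows. The text does not say how the mean of a merged class is formed or
which symbol it receives, so the sample-weighted mean and the choice of symbol
below are assumptions, as are the function names and the toy two-channel
values.

```python
import math
from itertools import combinations


def merge_closest_pair(classes):
    """Merge the two classes whose mean vectors are closest (Euclidean).

    `classes` maps a class symbol to (mean_vector, sample_count).
    Returns a new dict containing one class fewer.
    """
    a, b = min(
        combinations(classes, 2),
        key=lambda pair: math.dist(classes[pair[0]][0], classes[pair[1]][0]),
    )
    (mean_a, n_a), (mean_b, n_b) = classes[a], classes[b]
    n = n_a + n_b
    # Assumption: the merged mean is the sample-weighted average.
    merged_mean = tuple(
        (n_a * x + n_b * y) / n for x, y in zip(mean_a, mean_b)
    )
    # Assumption: the merged class keeps the symbol of the larger class.
    keep, drop = (a, b) if n_a >= n_b else (b, a)
    merged = {s: v for s, v in classes.items() if s != drop}
    merged[keep] = (merged_mean, n)
    return merged


def merge_down_to(classes, target):
    """Repeat the pairwise merge until only `target` classes remain."""
    while len(classes) > target:
        classes = merge_closest_pair(classes)
    return classes


# Hypothetical two-channel class means and sample counts.
classes = {
    "1": ((100.0, 100.0), 30),
    "2": ((102.0, 100.0), 10),
    "3": ((150.0, 160.0), 20),
    "4": ((200.0, 90.0), 5),
}
result = merge_down_to(classes, 3)
```

Because the rule looks only at distances between means, it takes no account
of class size, so a very large class can be absorbed as readily as a small
one.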

The physical identities of the nine classes are determined in this case by
comparing the unsupervised classification map (Figure 4-24) with the ground
truth survey map (Figure 4-25). The result is also shown in Table 4-1. In
actual operation, the physical identities will be established by checking a
small percentage of each class on site. As mentioned earlier, it is not
possible for this set of data to make an exact assessment of classification
accuracy. The overall accuracy is about 80 percent. This comparison was made
by Dr. W. H. Smedes, who has detailed knowledge of this test site (ref. 4).
The main misclassification came from the mingling of two terrain types, water
and talus, even prior to the merging of classes. The mean spectral signatures
of water and talus, as obtained from small areas in the test site, are given
below:

           Ch-2    Ch-9    Ch-10   Ch-12

   Water   85.3    84.2    81.7    67.2

   Talus   77.3    75.5    82.1    50.1

These are quite similar to each other in comparison with the mean spectral
signatures of the other 16 classes before the merging of classes (Table 4-2).
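
As a worked illustration of the distance measure used for merging, the
Euclidean distance between the water and talus signatures tabulated above can
be computed directly. The report does not state this figure; it is derived
here only from the eight tabulated values, and no comparison with the Table
4-2 classes is attempted because those values are not reproduced in this
section.

```python
import math

# Mean spectral signatures in channels 2, 9, 10, and 12 (from the text).
water = (85.3, 84.2, 81.7, 67.2)
talus = (77.3, 75.5, 82.1, 50.1)

# Per-channel absolute differences and the overall Euclidean distance.
differences = [abs(w - t) for w, t in zip(water, talus)]
distance = math.dist(water, talus)

# Index of the channel that separates the two signatures most
# (0 -> Ch-2, 1 -> Ch-9, 2 -> Ch-10, 3 -> Ch-12).
most_separating = differences.index(max(differences))
```

The distance comes to about 20.8 in recorded units. Most of it is contributed
by channel 12, while channel 10 differs by only about 0.4.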

4.4 COMPARISON WITH SUPERVISED CLASSIFICATION 

For a better appraisal of the performance of the composite clustering
technique, the unsupervised classification map (Figure 4-24) is compared with
the supervised classification map obtained by Purdue University's LARS using
the maximum likelihood method over the same test area (refs. 4 and 10). The
supervised classification map is shown in Figure 4-26. The accuracy of that
classification is found to be about 86 percent, as also reported by Dr. Smedes.
This accuracy is higher than the 80 percent achieved by the composite
clustering technique. However, to obtain this higher accuracy, much human
intervention and manipulation were needed: (a) knowing where to pick the
typical training areas for every type of terrain of interest, (b) classifying
the entire set of data and calculating the classification accuracy, and (c)
selecting new training areas when the accuracy was found not to be good
enough. In contrast to this iterative processing with close human supervision,
the unsupervised classification maps (Figure 4-14 or Figure 4-24) were
obtained with very little human intervention, namely, only specifying the
maximum number of initial classes with which to begin the processing. The
computation times required by the LARS supervised and the unsupervised
composite classification methods are about the same.

Figure 4-2. PROBABILITY HISTOGRAM OF CHANNEL 2




Figure 4-3. PROBABILITY HISTOGRAM OF CHANNEL 9
Figure 4-4. PROBABILITY HISTOGRAM OF CHANNEL 10
Figure 4-5. PROBABILITY HISTOGRAM OF CHANNEL 12
Figure 4-11. SCATTER PLOT OF CHANNELS 9 AND 10
Figure 4-12. SCATTER PLOT OF CHANNELS 9 AND 12
Figure 4-13. SCATTER PLOT OF CHANNELS 10 AND 12
Figure 4-14. THE INVENTORY BOUNDARY MAP BY THE BOUNDARY ENHANCEMENT TECHNIQUE
Figure 4-15. UNSUPERVISED CLASSIFICATION MAP AFTER ONE ITERATION WITH 18 CLASSES
Figure 4-16. UNSUPERVISED CLASSIFICATION MAP AFTER TWO ITERATIONS
WITH 17 CLASSES
Figure 4-19. UNSUPERVISED CLASSIFICATION MAP AFTER THE THIRD MERGING WITH 14 CLASSES
Figure 4-20. UNSUPERVISED CLASSIFICATION MAP AFTER THE FOURTH MERGING WITH 13 CLASSES
Figure 4-21. UNSUPERVISED CLASSIFICATION MAP AFTER THE FIFTH MERGING WITH 12 CLASSES
Figure 4-22. UNSUPERVISED CLASSIFICATION MAP AFTER THE SIXTH MERGING WITH 11 CLASSES


4-28 







NORTHROP SERVICES. INC. 


TR-1075 


[Line-printer classification map: character-coded image not legible in the scanned text.] 

Figure 4-23. UNSUPERVISED CLASSIFICATION MAP AFTER THE SEVENTH MERGING WITH 10 CLASSES 
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Figure 4-24. UNSUPERVISED CLASSIFICATION MAP AFTER THE EIGHTH MERGING WITH 9 CLASSES 
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Figure 4-25. GROUND TRUTH SURVEY MAP 
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Table 4-2. MEAN SPECTRAL VECTORS FOR 18 CLASSES - YELLOWSTONE NATIONAL PARK 


CLASS    CLASS    NUMBER OF          MEAN SPECTRAL VECTOR
NUMBER   SYMBOL   SAMPLES      CH-2     CH-9     CH-10    CH-12

   1       )        1120      110.32   135.89   150.51    83.4
   2       I         823       91.45   107.25   115.83    81.89
   3       W         836       84.19    96.25   103.65    76.18
   4       •        1625       62.49    58.83    58.65    58.80
   5       V         923       93.29   115.51   128.92    82.11
   6       -        1043       76.32    83.36    88.19    73.18
   7       $         905      116.27   145.62   163.33    86.24
   8       ★        1326       71.94    75.05    77.68    69.70
   9       /         433       86.99    86.66    89.57    56.18
  10       4         975      121.65   155.46   175.72    88.62
  11       M        1518       67.15    67.50    68.46    67.66
  12       +         866      103.71   125.70   138.82    82.40
  13       H        1272       57.26    51.61    50.94    53.01
  14       Z         319      136.69   175.95   204.94    93.73
  15       =         779       80.16   101.62   116.34    77.46
  16       2        1059       53.86    45.50    44.74    42.20
  17       3         828       78.89    89.27    93.61    85.93

TOTAL = 16,650 Samples 
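
As an illustration of how class means such as those in Table 4-2 could drive
a merging step, the following Python sketch finds the two classes whose mean
spectral vectors are closest and merges them with a sample-weighted mean. The
Euclidean distance criterion and the function names `closest_pair` and
`merge_closest` are assumptions made for this example; the merging rule
actually used to produce the maps in this section is not restated here.

```python
import math

# (class number, number of samples, mean vector for CH-2, CH-9, CH-10, CH-12),
# copied from Table 4-2.
CLASSES = [
    (1, 1120, (110.32, 135.89, 150.51, 83.4)),
    (2, 823, (91.45, 107.25, 115.83, 81.89)),
    (3, 836, (84.19, 96.25, 103.65, 76.18)),
    (4, 1625, (62.49, 58.83, 58.65, 58.80)),
    (5, 923, (93.29, 115.51, 128.92, 82.11)),
    (6, 1043, (76.32, 83.36, 88.19, 73.18)),
    (7, 905, (116.27, 145.62, 163.33, 86.24)),
    (8, 1326, (71.94, 75.05, 77.68, 69.70)),
    (9, 433, (86.99, 86.66, 89.57, 56.18)),
    (10, 975, (121.65, 155.46, 175.72, 88.62)),
    (11, 1518, (67.15, 67.50, 68.46, 67.66)),
    (12, 866, (103.71, 125.70, 138.82, 82.40)),
    (13, 1272, (57.26, 51.61, 50.94, 53.01)),
    (14, 319, (136.69, 175.95, 204.94, 93.73)),
    (15, 779, (80.16, 101.62, 116.34, 77.46)),
    (16, 1059, (53.86, 45.50, 44.74, 42.20)),
    (17, 828, (78.89, 89.27, 93.61, 85.93)),
]


def closest_pair(classes):
    """Return (distance, i, j) for the two class means that are nearest.

    Assumption: plain Euclidean distance between mean vectors.
    """
    best = None
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            d = math.dist(classes[i][2], classes[j][2])
            if best is None or d < best[0]:
                best = (d, i, j)
    return best


def merge_closest(classes):
    """Merge the nearest pair; the new mean is weighted by sample counts."""
    _, i, j = closest_pair(classes)
    (ci, ni, mi), (_, nj, mj) = classes[i], classes[j]
    n = ni + nj
    merged_mean = tuple((ni * a + nj * b) / n for a, b in zip(mi, mj))
    rest = [c for k, c in enumerate(classes) if k not in (i, j)]
    # The merged class keeps the lower class number of the pair.
    return rest + [(ci, n, merged_mean)]


if __name__ == "__main__":
    merged = merge_closest(CLASSES)
    print(len(merged), "classes after one merge")
```

Weighting the merged mean by sample counts keeps it equal to the mean of the
pooled samples, so the total of 16,650 samples is preserved by each merge.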
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Figure 4-26. LARS CLASSIFICATION: YELLOWSTONE NATIONAL PARK (CH-2, 9, 10 AND 12) 
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Section V 

CONCLUSIONS 

In this study, a new composite clustering technique, combining statistical 
sequential clustering with generalized K-means clustering, has been developed. 
It was applied to the automatic unsupervised classification of multispectral 
remote sensing data from the Yellowstone National Park test site and the 
Purdue C-1 flight line. The technique was found to be about 80 percent correct 
on both data sets, compared with classification accuracies of 86 and 85 
percent, respectively, obtained by the Purdue LARS supervised maximum 
likelihood classification method. Given how little human intervention the 
unsupervised classification requires, this slightly lower accuracy still seems 
rather good. With these two demonstrations on actual data, it seems fair to 
assert that the new composite technique may be useful for processing various 
earth resources survey data. From the operational viewpoint, the unsupervised 
technique is also believed to be more feasible than the supervised techniques. 

Some automatic decision logic remains to be developed for the present 
unsupervised technique: (a) to decide the number of class mergings best suited 
to any given data set, and (b) to examine the homogeneity of every class 
established. These two decision logics are closely related and are important 
for establishing a completely autonomous unsupervised classification system. 
Investigation of such decision logic and its implementation in the computer 
programs are presently underway. Effort is also underway to integrate the 
statistical sequential clustering and generalized K-means clustering computer 
programs into a single, more efficient program for operational data 
processing. These developments will be reported in the future. 
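
The integration described above, a one-pass sequential clustering that seeds a
K-means refinement, can be sketched in Python as follows. This is only an
illustration under stated assumptions: a fixed Euclidean distance threshold
stands in for the statistical test of the sequential stage, plain K-means
stands in for the generalized K-means stage, and the names `sequential_seed`,
`kmeans_refine`, `nearest`, and `threshold` are invented for the example.

```python
import math


def nearest(x, means):
    """Index of the mean closest to sample x (Euclidean distance)."""
    return min(range(len(means)), key=lambda i: math.dist(x, means[i]))


def sequential_seed(samples, threshold):
    """One pass over the data: join the nearest cluster or open a new one.

    Assumption: a fixed distance threshold replaces the report's
    statistical (variance-based) test.
    """
    means, counts = [], []
    for x in samples:
        k = nearest(x, means) if means else None
        if k is None or math.dist(x, means[k]) > threshold:
            means.append([float(v) for v in x])
            counts.append(1)
        else:
            counts[k] += 1
            # Incremental update of the running cluster mean.
            means[k] = [m + (v - m) / counts[k] for m, v in zip(means[k], x)]
    return means


def kmeans_refine(samples, means, iterations=10):
    """Standard K-means iterations starting from the seed means."""
    for _ in range(iterations):
        labels = [nearest(x, means) for x in samples]
        new_means = []
        for i, old in enumerate(means):
            members = [x for x, lab in zip(samples, labels) if lab == i]
            if members:
                new_means.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_means.append(old)  # keep an empty cluster's mean unchanged
        if new_means == means:
            break
        means = new_means
    return means, [nearest(x, means) for x in samples]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    seeds = sequential_seed(data, threshold=5.0)
    final_means, labels = kmeans_refine(data, seeds)
    print(final_means, labels)
```

Seeding K-means from a single sequential pass means the number of classes does
not have to be fixed in advance; in this sketch it is set indirectly by the
assumed threshold.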
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